SFTP Bulk

Adding SFTP Bulk as a data source

Prerequisites for adding SFTP Bulk

The following connector information is required from the client:

Username
Password
Host
Port
File Type
Stream Name
Folder Path
CSV Separator
Start Date
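
Before entering these values, it can help to sanity-check them. The following is a minimal sketch, not part of the product: the `SftpBulkSettings` class and `find_problems` function are hypothetical names, and the single-character separator rule is an assumption made for illustration.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class SftpBulkSettings:
    """Hypothetical container for the values collected from the client."""
    username: str
    password: str
    host: str
    port: int
    file_type: str
    stream_name: str
    folder_path: str
    csv_separator: str
    start_date: date


def find_problems(s: SftpBulkSettings) -> list[str]:
    """Return readable problems; an empty list means nothing obvious is wrong."""
    problems = []
    for name in ("username", "password", "host", "stream_name"):
        if not getattr(s, name).strip():
            problems.append(f"{name} is empty")
    if not 1 <= s.port <= 65535:
        problems.append("port must be between 1 and 65535")
    # The folder path is expected to be absolute on the SFTP server.
    if not s.folder_path.startswith("/"):
        problems.append("folder_path must be an absolute path")
    # Assumption for this sketch: the separator is a single character.
    if len(s.csv_separator) != 1:
        problems.append("csv_separator must be a single character")
    return problems
```

Calling `find_problems` on a filled-in `SftpBulkSettings` before opening the form gives a quick list of values to confirm with the client.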

Do the following:

Log in to the SFTP server using your credentials.
Create a folder on the server and upload your files to it.
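
These two steps can also be scripted. The sketch below assumes the OpenSSH `sftp` command-line client is available and only builds the text of a batch file for it; `build_sftp_batch` is a hypothetical helper, and the file names in the usage note are placeholders.

```python
import posixpath


def build_sftp_batch(remote_folder: str, local_files: list[str]) -> str:
    """Build the text of a batch file for `sftp -b`.

    Assumes POSIX-style local paths and paths without double quotes.
    """
    # A leading "-" tells sftp to continue if the command fails,
    # for example because the folder already exists.
    lines = [f'-mkdir "{remote_folder}"']
    for local in local_files:
        remote = posixpath.join(remote_folder, posixpath.basename(local))
        lines.append(f'put "{local}" "{remote}"')
    return "\n".join(lines) + "\n"
```

For example, writing the result of `build_sftp_batch("/home/Ubuntu/SFTP/credit", ["credit_2024.csv"])` to a file and passing that file to `sftp -b` would create the folder and upload the file. Batch mode is non-interactive, so it is normally paired with key-based authentication rather than a typed password.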

To add SFTP Bulk as a data source, do the following:

From the left navigation panel, click Lakehouse and then click Data Sources.
In the upper-right corner of the page, click the + New Database button to start adding a new data source.
On the New Data Source page, click the SFTP icon.

Specify the following details to add SFTP Bulk. Once the data source is connected, the system immediately fetches its schema. After schema retrieval completes, you can browse and interact with the tables and data.

Field	Description
Connection Name	Enter a unique name for the connection.
File Type	Currently, only CSV files are supported.
Username / Password	Specify the client credentials.
Host Address	Specify the SFTP server address.
Port	Specify the port number of the SFTP server.
Stream Name	Enter the name of the output table (data stream) to create in the destination warehouse. This can be any name and is independent of the actual CSV file names. Sync modes (incremental or full refresh) are configured at the stream level, not at the pipeline level.
Folder Path	Provide the absolute path to the folder on the SFTP server containing the CSV files (e.g., `/home/Ubuntu/SFTP/credit`). Ensure this path is accurate.
Start Date	Specify the date from which to begin replicating data. Set an earlier date to include historical data.
CSV Separator	Specify the delimiter used in the CSV files. The default is a comma; other separators, such as a space, can also be configured.

Click Submit.
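
To see why the CSV Separator value matters, the sketch below parses the same sample text with the correct and with the default separator using Python's standard `csv` module. It only illustrates delimiter handling in general; `read_rows` and the sample data are made up for this example and say nothing about how the connector parses files internally.

```python
import csv
import io


def read_rows(text: str, separator: str = ",") -> list[dict[str, str]]:
    """Parse CSV text whose first line is a header, using the given separator."""
    return list(csv.DictReader(io.StringIO(text), delimiter=separator))


sample = "id;amount\n1;10.50\n2;7.25\n"

# Correct separator: two columns per row.
rows = read_rows(sample, separator=";")

# Default comma separator on the same text: each line collapses
# into a single column named "id;amount".
collapsed = read_rows(sample)
```

If the configured separator does not match the files, every row ends up in one wide column, which is usually the first symptom to look for.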

Supported Sync modes

Full Refresh | Overwrite
Full Refresh | Append
Incremental Sync | Append
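
The difference between these modes can be sketched with plain lists. This is a simplified model of how such modes generally behave in data pipelines, not the product's implementation; the function names and the `modified` cursor field are assumptions for illustration.

```python
def full_refresh_overwrite(destination, source_rows):
    """Replace everything in the destination with the current source rows."""
    return list(source_rows)


def full_refresh_append(destination, source_rows):
    """Re-read all source rows and add them after the existing rows."""
    return destination + list(source_rows)


def incremental_append(destination, source_rows, cursor):
    """Add only rows newer than the cursor, then advance the cursor."""
    new_rows = [r for r in source_rows if r["modified"] > cursor]
    new_cursor = max((r["modified"] for r in new_rows), default=cursor)
    return destination + new_rows, new_cursor


source = [
    {"id": 1, "modified": "2024-01-01"},
    {"id": 2, "modified": "2024-02-01"},
]
destination = [{"id": 1, "modified": "2024-01-01"}]

overwritten = full_refresh_overwrite(destination, source)
appended = full_refresh_append(destination, source)
incremental, cursor = incremental_append(destination, source, "2024-01-01")
```

In this model, full refresh with append duplicates the row that was already synced, while incremental append adds only the row modified after the cursor.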

Supported Streams

This source provides a single stream per file with a dynamic schema. The currently supported file types are Avro, CSV, JSONL, Parquet, and Document File Type Format.
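
A dynamic schema means the columns are derived from the file rather than declared up front. The sketch below shows one simple way to infer column types from a CSV file; `infer_type` and `infer_schema` are hypothetical helpers, and the three type names are an assumption, not the connector's actual type system.

```python
import csv
import io


def infer_type(values):
    """Pick the narrowest of integer, number, or string that fits every value."""
    if not values:
        return "string"
    for type_name, cast in (("integer", int), ("number", float)):
        try:
            for value in values:
                cast(value)
            return type_name
        except (ValueError, TypeError):
            continue
    return "string"


def infer_schema(csv_text, separator=","):
    """Map each header column to an inferred type name."""
    reader = csv.DictReader(io.StringIO(csv_text), delimiter=separator)
    rows = list(reader)
    return {
        column: infer_type([row[column] for row in rows])
        for column in (reader.fieldnames or [])
    }


schema = infer_schema("id,amount,name\n1,10.5,a\n2,7,b\n")
```

Because the schema follows the file contents, adding or renaming a column in the source files changes the schema of the resulting stream.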
